Context-sensitive intra-class clustering
Abstract
This paper describes a new semi-supervised learning algorithm for intra-class clustering (ICC). ICC partitions each class into sub-classes in order to minimize overlap across clusters from different classes. It achieves this by letting data points from other classes assist the partitioning of a given class in a context-dependent fashion. As a result, overlap across sub-classes (both within and across classes) is greatly reduced. ICC is particularly useful when combined with algorithms that assume each class has a unimodal Gaussian distribution (e.g., Linear Discriminant Analysis (LDA) and quadratic classifiers), an assumption that often fails in real-world situations. ICC can partition non-Gaussian, multimodal distributions to overcome this problem; in this sense, it works as a preprocessor. Experiments with our ICC algorithm on synthetic and real-world data sets indicated that it can significantly improve the performance of LDA and quadratic classifiers. We expect our approach to be applicable to a broader class of pattern recognition problems in which class-conditional densities are significantly non-Gaussian or multimodal. © 2013 Elsevier B.V. All rights reserved.
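As a rough illustration of the preprocessing idea in the abstract, the Python sketch below splits a bimodal class into sub-classes before fitting LDA and then maps sub-class predictions back to the parent class. It is not the paper's ICC algorithm: plain per-class k-means stands in for the context-sensitive partitioning, so points from other classes do not influence the split. The function names (`kmeans`, `split_classes`, `fit_lda`, `predict_lda`), the synthetic data, and the sub-class counts are all assumptions made for this example.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster is empty
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def split_classes(X, y, k_per_class):
    """Cluster each class on its own (stand-in for ICC, NOT context-sensitive)."""
    sub = np.empty(len(y), dtype=int)
    parent = []  # parent[s] is the original class of sub-class s
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        k = k_per_class[int(c)]
        sub[idx] = kmeans(X[idx], k) + len(parent)
        parent.extend([int(c)] * k)
    return sub, np.array(parent)

def fit_lda(X, y):
    """LDA with a pooled covariance matrix and empirical priors."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    scatter = sum((X[y == c] - m).T @ (X[y == c] - m)
                  for c, m in zip(classes, means))
    cov = scatter / (len(X) - len(classes))
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, np.linalg.inv(cov), priors

def predict_lda(model, X):
    classes, means, prec, priors = model
    scores = (X @ prec @ means.T
              - 0.5 * np.sum(means @ prec * means, axis=1)
              + np.log(priors))
    return classes[scores.argmax(axis=1)]

# Synthetic data (assumed): class 0 is bimodal and surrounds class 1.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([-4.0, 0.0], 1.0, (100, 2)),  # class 0, left mode
    rng.normal([4.0, 0.0], 1.0, (100, 2)),   # class 0, right mode
    rng.normal([0.0, 0.0], 1.0, (200, 2)),   # class 1
])
y = np.array([0] * 200 + [1] * 200)

# LDA on the original two classes (one Gaussian per class).
acc_plain = (predict_lda(fit_lda(X, y), X) == y).mean()

# LDA on sub-classes, with predictions mapped back to parent classes.
sub, parent = split_classes(X, y, {0: 2, 1: 1})
acc_sub = (parent[predict_lda(fit_lda(X, sub), X)] == y).mean()

print(f"training accuracy, plain LDA: {acc_plain:.3f}")
print(f"training accuracy, sub-class LDA: {acc_sub:.3f}")
```

In this toy setting the two class means nearly coincide, so a single linear boundary cannot separate the classes, whereas the sub-class means are well separated. The number of sub-classes per class is fixed by hand here; how ICC itself determines the partition is not described in the abstract.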
Related resources
Preserving Class Discriminatory Information by Context-sensitive Intra-class Clustering Algorithm
Many powerful techniques in supervised learning (e.g. linear discriminant analysis, LDA, and quadratic classifier) assume that data in each class have a single Gaussian distribution. In reality, data in the class of interest, i.e., the object class, could have non-Gaussian distributions and could be isolated into several subgroups by the data from other classes (the context classes). To address...
Prototypes Selection with Context Based Intra-class Clustering for Video Annotation with Mpeg7 Features
In this work, we analyze the effectiveness of perceptual features to automatically annotate video clips in domain-specific video digital libraries. Typically, automatic annotation is provided by computing the clip similarity with respect to pre-annotated examples, which constitute the knowledge base, in accordance with a given ontology or a classification scheme. The amount of training clips is...
CAMAC: a context-aware mandatory access control model
Mandatory access control models have traditionally been employed as a robust security mechanism in multilevel security environments such as military domains. In traditional mandatory models, the security classes associated with entities are context-insensitive. However, context-sensitivity of security classes and flexibility of access control mechanisms may be required especially in pervasive c...
HMM state clustering across allophone class boundaries
We present a novel approach to hidden Markov model (HMM) state clustering based on the use of broad phone classes and an allophone class entropy measure. Most state-of-the-art large-vocabulary speech recognizers are based on context-dependent (CD) phone HMMs that use Gaussian mixture models for the state-conditioned observation densities. A common approach for robust HMM parameter estimation is ...
Clustering Analysis on E-commerce Transaction Based on K-means Clustering
Clustering algorithms based on density, increments, grids, and similar ideas usually show shortcomings such as poor elasticity, weak handling of high-dimensional data, sensitivity to the time sequence of the data, poor independence of parameters, and weak handling of noise when facing large amounts of high-dimensional transaction data. Making experiments by sampling data samples of the 300...
Journal:
- Pattern Recognition Letters
Volume 37
Publication date 2014